Search for: All records

Editors contains: "Schwartz, Russell"


  1. Schwartz, Russell (Ed.)
    Computational models are complex scientific constructs that have become essential for us to better understand the world. Many models are valuable for peers within and beyond disciplinary boundaries. However, there are no widely agreed-upon standards for sharing models. This paper suggests 10 simple rules for you to both (i) ensure you share models in a way that is at least “good enough,” and (ii) enable others to lead the change towards better model-sharing practices. 
    Free, publicly-accessible full text available January 10, 2026
  2. Schwartz, Russell (Ed.)
    Abstract Summary Pool sequencing is an efficient method for capturing genome-wide allele frequencies from multiple individuals, with broad applications such as studying adaptation in Evolve-and-Resequence experiments, monitoring of genetic diversity in wild populations, and genotype-to-phenotype mapping. Here, we present grenedalf, a command line tool written in C++ that implements common population genetic statistics such as θ, Tajima’s D, and FST for Pool sequencing. It is orders of magnitude faster than current tools, and is focused on providing usability and scalability, while also offering a plethora of input file formats and convenience options. Availability and implementation grenedalf is published under the GPL-3, and freely available at github.com/lczech/grenedalf.
  3. Schwartz, Russell (Ed.)
    Abstract Motivation Since 2016, the number of microbial species with available reference genomes in NCBI has more than tripled. Multiple genome alignment, the process of identifying nucleotides across multiple genomes which share a common ancestor, is used as the input to numerous downstream comparative analysis methods. Parsnp is one of the few multiple genome alignment methods able to scale to the current era of genomic data; however, there has been no major release since its initial release in 2014. Results To address this gap, we developed Parsnp v2, which significantly improves on its original release. Parsnp v2 provides users with more control over executions of the program, allowing Parsnp to be better tailored for different use-cases. We introduce a partitioning option to Parsnp, which allows the input to be broken up into multiple parallel alignment processes which are then combined into a final alignment. The partitioning option can reduce memory usage by over 4× and reduce runtime by over 2×, all while maintaining a precise core-genome alignment. The partitioning workflow is also less susceptible to complications caused by assembly artifacts and minor variation, as alignment anchors only need to be conserved within their partition and not across the entire input set. We highlight the performance on datasets involving thousands of bacterial and viral genomes. Availability and implementation Parsnp v2 is available at https://github.com/marbl/parsnp.
  4. Schwartz, Russell (Ed.)
    In 2020, the combination of police killings of unarmed Black people, including George Floyd, Breonna Taylor, and Ahmaud Arbery, and the Coronavirus Disease 2019 (COVID-19) pandemic brought about public outrage over long-standing inequalities in society. The events of 2020 ignited global attention to systemic racism and racial inequalities, including the lack of diversity, equity, and inclusion in the academy and especially in science, technology, engineering, mathematics, and medicine (STEMM) fields. Racial and ethnic diversity in graduate programs in particular warrants special attention as graduate students of color report experiencing alarming rates of racism, discrimination, microaggressions, and other exclusionary behaviors. As part of the Graduate Dean’s Advisory Council on Diversity (GDACD) at the University of California Merced, the authors of this manuscript held a year-long discussion on these issues and ways to take meaningful action to address these persistent issues of injustices. We have outlined 10 rules to help graduate programs develop antiracist practices to promote racial and ethnic justice, equity, diversity, and inclusion (JEDI) in the academy. We focus on efforts to address systemic causes of the underrepresentation and attrition of students from minoritized communities. The 10 rules are developed to allow graduate groups to formulate and implement rules and policies to address root causes of underrepresentation of minoritized students in graduate education. 
  5. Schwartz, Russell (Ed.)
    Science students increasingly need programming and data science skills to be competitive in the modern workforce. However, at our university (San Francisco State University), until recently, almost no biology, biochemistry, and chemistry students (from here bio/chem students) completed a minor in computer science. To change this, a new minor in computing applications, which is informally known as the Promoting Inclusivity in Computing (PINC) minor, was established in 2016. Here, we present the lessons we learned from our experience in a set of 10 rules. The first 3 rules focus on setting up the program so that it interests students in biology, chemistry, and biochemistry. Rules 4 through 8 focus on how the classes of the program are taught to make them interesting for our students and to provide the students with the support they need. The last 2 rules are about what happens “behind the scenes” of running a program with many people from several departments involved. 
  6. Schwartz, Russell (Ed.)
  7. Schwartz, Russell (Ed.)
    Abstract Motivation Identification and interpretation of non-coding variations that affect disease risk remain a paramount challenge in genome-wide association studies (GWAS) of complex diseases. Experimental efforts have provided comprehensive annotations of functional elements in the human genome. On the other hand, advances in computational biology, especially machine learning approaches, have facilitated accurate predictions of cell-type-specific functional annotations. Integrating functional annotations with GWAS signals has advanced the understanding of disease mechanisms. In previous studies, functional annotations were treated as static properties of a genomic region, ignoring potential functional differences imposed by different genotypes across individuals. Results We develop a computational approach, Openness Weighted Association Studies (OWAS), to leverage and aggregate predictions of chromosome accessibility in personal genomes for prioritizing GWAS signals. The approach relies on an analytical expression we derived for identifying disease associated genomic segments whose effects in the etiology of complex diseases are evaluated. In extensive simulations and real data analysis, OWAS identifies genes/segments that explain more heritability than existing methods, and has a better replication rate in independent cohorts than GWAS. Moreover, the identified genes/segments show tissue-specific patterns and are enriched in disease relevant pathways. We use rheumatic arthritis and asthma as examples to demonstrate how OWAS can be exploited to provide novel insights on complex diseases. Availability and implementation The R package OWAS that implements our method is available at https://github.com/shuangsong0110/OWAS. Supplementary information Supplementary data are available at Bioinformatics online.
  8. Schwartz, Russell (Ed.)
    Abstract Summary PERMANOVA (permutational multivariate analysis of variance based on distances) has been widely used for testing the association between the microbiome and a covariate of interest. Statistical significance is established by permutation, which is computationally intensive for large sample sizes. As large-scale microbiome studies, such as American Gut Project (AGP), become increasingly popular, a computationally efficient version of PERMANOVA is much needed. To achieve this end, we derive the asymptotic distribution of the PERMANOVA pseudo-F statistic and provide analytical P-value calculation based on chi-square approximation. We show that the asymptotic P-value is close to the PERMANOVA P-value even under a moderate sample size. Moreover, it is more accurate and an order-of-magnitude faster than the permutation-free method MDMR. We demonstrated the use of our procedure D-MANOVA on the AGP dataset. Availability and implementation D-MANOVA is implemented by the dmanova function in the CRAN package GUniFrac. Supplementary information Supplementary data are available at Bioinformatics online. 
  9. Schwartz, Russell (Ed.)
  10. Schwartz, Russell (Ed.)
    Abstract Motivation While gene–environment (GxE) interactions contribute importantly to many different phenotypes, detecting such interactions requires well-powered studies and has proven difficult. To address this, we combine two approaches to improve GxE power: simultaneously evaluating multiple phenotypes and using a two-step analysis approach. Previous work shows that the power to identify a main genetic effect can be improved by simultaneously analyzing multiple related phenotypes. For a univariate phenotype, two-step methods produce higher power for detecting a GxE interaction compared to single step analysis. Therefore, we propose a two-step approach to test for an overall GxE effect for multiple phenotypes. Results Using simulations we demonstrate that, when more than one phenotype has GxE effect (i.e. GxE pleiotropy), our approach offers substantial gain in power (18–43%) to detect an aggregate-level GxE effect for a multivariate phenotype compared to an analogous two-step method to identify GxE effect for a univariate phenotype. We applied the proposed approach to simultaneously analyze three lipids, LDL, HDL and Triglyceride with the frequency of alcohol consumption as environmental factor in the UK Biobank. The method identified two loci with an overall GxE effect on the vector of lipids, one of which was missed by the competing approaches. Availability and implementation We provide an R package MPGE implementing the proposed approach which is available from CRAN: https://cran.r-project.org/web/packages/MPGE/index.html Supplementary information Supplementary data are available at Bioinformatics online. 
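
Entry 8 describes permutation-based PERMANOVA as computationally intensive for large samples. As a rough illustration of why, the sketch below computes the standard pseudo-F statistic from a distance matrix and obtains a P-value by shuffling group labels. It is not the D-MANOVA asymptotic method and not the GUniFrac API; the function names `pseudo_f` and `permanova`, their parameters, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def pseudo_f(dist, groups):
    """PERMANOVA pseudo-F for a square distance matrix and one grouping factor."""
    dist = np.asarray(dist, dtype=float)
    groups = np.asarray(groups)
    n = dist.shape[0]
    labels = np.unique(groups)
    a = len(labels)
    sq = dist ** 2
    # Total sum of squares: squared distances over all pairs i < j, divided by n.
    ss_total = sq[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares: same quantity per group, divided by group size.
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        sub = sq[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation P-value) by shuffling labels."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    f_obs = pseudo_f(dist, groups)
    # Each permutation recomputes the statistic, so cost grows with n_perm.
    hits = sum(
        pseudo_f(dist, rng.permutation(groups)) >= f_obs for _ in range(n_perm)
    )
    return f_obs, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): six one-dimensional samples in two groups.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
labels = np.array(["A", "A", "A", "B", "B", "B"])
d = np.abs(x[:, None] - x[None, :])  # Euclidean distance in one dimension
f_stat, p_value = permanova(d, labels)
print(f_stat, p_value)
```

With Euclidean distances on one-dimensional data, the pseudo-F coincides with the classical one-way ANOVA F statistic, which is 150 for this toy example. Every permutation repeats the full computation, which is the cost that an analytical P-value avoids.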